Machine Learning Systems Design

Note

Most of the notes are from the book Machine Learning Systems Design

Overview of Machine Learning Systems Design

Pasted image 20230622120707.png|500

MLOps

What is MLOps (Machine Learning Operations)

Requirements for ML Systems

Iterative Process of Developing an ML system

1. Project scoping

2. Data engineering

Data pipeline in MLOps

Major types of data problems

Data pipeline at different phases

Best practice

3. ML model development

Coping with ML training challenges

Checkpointing
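
A minimal sketch of checkpointing, assuming a toy training loop whose whole state fits in a plain dictionary. The names `save_checkpoint`, `load_checkpoint`, and `train` are hypothetical and chosen for illustration only; they are not from the book. The state is written to a temporary file and then renamed, so an interrupted write should not corrupt the last good checkpoint.

```python
import os
import pickle
import tempfile


def save_checkpoint(state, path):
    """Write state to a temp file, then atomically rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def train(total_steps, path, every=10, stop_at=None):
    """Toy loop (assumption): 'weight' stands in for real model parameters."""
    # Resume from the last checkpoint if one exists, else start fresh.
    state = load_checkpoint(path) or {"step": 0, "weight": 0.0}
    while state["step"] < total_steps:
        if stop_at is not None and state["step"] >= stop_at:
            break  # simulate a crash or preemption
        state["weight"] += 0.5  # stand-in for one gradient update
        state["step"] += 1
        if state["step"] % every == 0:
            save_checkpoint(state, path)
    return state
```

If a run is interrupted at step 35 with `every=10`, the next call to `train` resumes from step 30, so at most `every - 1` steps of work are repeated. Real frameworks also save optimizer state and random seeds; this sketch leaves those out.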

Distributed training strategies

-> addresses scale challenges: increased training data volume, or increased model size and complexity

Best practice: Sanity-check test

4. ML Model Deployment

5. ML System Monitoring and Continual learning

6. Business analysis

Infrastructure and Tooling for MLOps

Infrastructure

Pasted image 20230714122416.png|400

Storage & Compute

Development Environment

Resource Management

Pipeline Orchestration

Orchestration provides end-to-end traceability of a pipeline: automation captures the specific inputs, outputs, and artifacts of each task.
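
As a rough illustration of capturing inputs, outputs, and artifacts per task, here is a minimal sketch of a sequential pipeline runner that records a lineage entry for every step. Everything in it (`fingerprint`, `run_pipeline`, the three toy tasks) is a hypothetical stand-in, not the API of any real orchestrator, and it assumes each task's input and output is JSON-serializable.

```python
import hashlib
import json
import time


def fingerprint(obj):
    """Stable short hash of a JSON-serializable value."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_pipeline(tasks, initial_input):
    """Run tasks in order, feeding each output to the next task.

    Returns the final output and a lineage log with one record per task.
    """
    lineage = []
    data = initial_input
    for name, fn in tasks:
        started = time.time()
        result = fn(data)
        lineage.append({
            "task": name,
            "input_hash": fingerprint(data),
            "output_hash": fingerprint(result),
            "seconds": time.time() - started,
        })
        data = result
    return data, lineage


# Hypothetical three-step pipeline on toy data.
tasks = [
    ("clean", lambda rows: [r for r in rows if r is not None]),
    ("scale", lambda rows: [r / 10 for r in rows]),
    ("summarize", lambda rows: {"count": len(rows), "mean": sum(rows) / len(rows)}),
]
output, lineage = run_pipeline(tasks, [10, None, 30])
```

Because each record stores both an input hash and an output hash, the output hash of one task matches the input hash of the next, which is what lets a result be traced back through the pipeline. Production orchestrators typically add scheduling, retries, and persistent artifact storage on top of this idea.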

Model Integration

= integrating models with ML applications

Human-in-the-Loop Pipelines

Human review of model predictions

Pasted image 20231004110235.png|500
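
One common way to set up human review of model predictions is to route low-confidence predictions to a review queue and auto-accept the rest. The sketch below assumes that approach; the 0.8 threshold, the dictionary fields, and the function names `route_predictions` and `apply_reviews` are illustrative assumptions rather than anything prescribed by the book.

```python
def route_predictions(predictions, threshold=0.8):
    """Split predictions into auto-accepted ones and ones needing human review."""
    accepted, review_queue = [], []
    for item in predictions:
        if item["confidence"] >= threshold:
            accepted.append(item)
        else:
            review_queue.append(item)
    return accepted, review_queue


def apply_reviews(review_queue, human_labels):
    """Replace the model label with the reviewer's label where one was given."""
    resolved = []
    for item in review_queue:
        reviewed = item["id"] in human_labels
        label = human_labels[item["id"]] if reviewed else item["label"]
        resolved.append({**item, "label": label, "reviewed": reviewed})
    return resolved


# Toy predictions (made-up values for illustration).
predictions = [
    {"id": 1, "label": "cat", "confidence": 0.95},
    {"id": 2, "label": "dog", "confidence": 0.55},
]
accepted, review_queue = route_predictions(predictions)
resolved = apply_reviews(review_queue, {2: "fox"})
```

The corrected labels in `resolved` can also be fed back as training data, which ties this pipeline to the continual learning step earlier in the notes.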